Deficit of homozygosity among 1.52 million individuals and genetic causes of recessive lethality

Genotypes causing pregnancy loss and perinatal mortality are depleted among living individuals and are therefore difficult to find. To explore genetic causes of recessive lethality, we searched for sequence variants with a deficit of homozygosity among 1.52 million individuals from six European populations. We identified 25 genes harboring protein-altering sequence variants with a strong deficit of homozygosity (10% or less of predicted homozygotes). Sequence variants in 12 of the genes cause Mendelian disease under a recessive mode of inheritance and in two under a dominant mode, but variants in the remaining 11 have not been reported to cause disease. Sequence variants with a strong deficit of homozygosity are over-represented among genes essential for growth of human cell lines and genes orthologous to mouse genes known to affect viability. The function of these genes gives insight into the genetics of intrauterine lethality. We also identified 1077 genes with homozygous predicted loss-of-function genotypes not previously described, bringing the total set of genes completely knocked out in humans to 4785.


Incomplete homozygous deficit
In addition to genes where we observed 10% or less of predicted homozygotes, we also examined variants in genes with a less pronounced deficit (Supplementary Data 4 and 6), with the aim of detecting sequence variants in known or novel disease genes that have a pathological impact on individuals primarily after birth.
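As an illustration of the quantity discussed here, the following is a minimal sketch of how an expected homozygote count and the observed fraction of predicted homozygotes could be computed. It assumes Hardy-Weinberg proportions within each population (expected homozygotes = N × q²) and simple summation across populations; the text does not specify the exact procedure used in the study. The names `PopulationCount`, `expected_homozygotes`, `observed_fraction`, and `is_strong_deficit`, and the sample sizes and allele frequencies in the usage lines, are hypothetical. Only the 10% threshold and the expected/observed pairs for CCDC59 and CASP9 come from the text.

```python
from dataclasses import dataclass


@dataclass
class PopulationCount:
    """Genotyped sample size and alternate-allele frequency for one population."""
    n_individuals: int
    allele_frequency: float


def expected_homozygotes(populations):
    """Sum N * q**2 over populations (assumes Hardy-Weinberg proportions)."""
    return sum(p.n_individuals * p.allele_frequency ** 2 for p in populations)


def observed_fraction(observed, expected):
    """Observed homozygotes as a fraction of the expected count."""
    if expected <= 0:
        raise ValueError("expected homozygote count must be positive")
    return observed / expected


def is_strong_deficit(observed, expected, threshold=0.10):
    """True if observed homozygotes are 10% or less of those predicted."""
    return observed_fraction(observed, expected) <= threshold


# Hypothetical populations (illustrative numbers, not from the study).
populations = [
    PopulationCount(n_individuals=100_000, allele_frequency=0.01),
    PopulationCount(n_individuals=400_000, allele_frequency=0.005),
]
print(expected_homozygotes(populations))

# Expected/observed pairs reported in the text.
print(is_strong_deficit(observed=0, expected=20.7))  # CCDC59
print(is_strong_deficit(observed=1, expected=11.9))  # CASP9
```

A ratio above the threshold but below one would correspond to the less pronounced deficit examined in this section. Whether such a ratio is statistically meaningful requires a formal test, which this sketch does not perform.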
The F508del-CFTR variant causes autosomal recessive cystic fibrosis, which is the most common lethal monogenic disease in Europeans 9, and is present in ~90% of cystic fibrosis cases 10. In the combined set, we observe an incomplete homozygous deficit (59%) for the [...]
Taken together, we observe a pronounced homozygous deficit of p.Arg185Ter in ATP5PB in Iceland, but we do not observe evidence for an excess of reported miscarriage among carrier couples despite substantial power to detect a difference. This suggests that homozygosity of ATP5PB:p.Arg185Ter leads to early embryonic lethality.

CCDC59
In our combined population set, a loss-of-function variant c.561_564+4del in CCDC59 has a homozygous deficit of 19 (expected homozygotes = 20.7, observed homozygotes = 0) (Table 1). The CCDC59:c.561_564+4del variant is an eight-base-pair deletion that is predicted to disrupt the splice donor site at the end of exon 3 of the CCDC59 gene. The frequency of this variant is similar in all six populations (0.23% to 0.48%). Homozygous CCDC59 knockout mice have been reported to show early embryonic lethality before organogenesis 7, and CCDC59 is essential for proliferation in human cell lines. Consequently, we hypothesize that CCDC59 is a developmental lethal gene 6. However, the CCDC59 gene has not been reported to cause disease in humans 41. The CCDC59 gene encodes TAP26, a transcription factor that plays a role in surfactant metabolism 42.
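To give a sense of how unlikely zero observed homozygotes is when 20.7 are expected, the following minimal sketch computes a lower-tail probability, assuming the homozygote count is approximately Poisson distributed around the expected value. This approximation is an assumption made for illustration and is not necessarily the test used in the study. The function name `poisson_lower_tail` is hypothetical.

```python
import math


def poisson_lower_tail(observed, expected):
    """Return P(X <= observed) for X ~ Poisson(expected)."""
    if observed < 0 or expected < 0:
        raise ValueError("observed and expected must be non-negative")
    term = math.exp(-expected)  # P(X = 0)
    total = term
    for k in range(1, observed + 1):
        term *= expected / k    # P(X = k) from P(X = k - 1)
        total += term
    return total


# CCDC59: expected homozygotes = 20.7, observed = 0 (values from the text).
print(poisson_lower_tail(observed=0, expected=20.7))
```

For zero observed homozygotes the result reduces to exp(-20.7), which is on the order of 10^-9. The same function applies to any expected/observed pair, for example one observed homozygote against 48.2 expected.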
In Iceland, we identified seven heterozygous carrier couples of the CCDC59 splice donor variant. We do not observe a history of early lethality or indication of severe congenital conditions among the 16 offspring. Furthermore, we do not observe an excess of reported miscarriage in the set of 22 pregnancies among the seven carrier couples (n miscarriages/n pregnancies = 5/22) (Supplementary Data 19). Among the nine genotyped offspring of carrier couples, five are heterozygotes and four are homozygous for the reference allele.
Taken together, the data are consistent with the hypothesis that, as in mice, homozygosity for CCDC59 loss of function causes embryonic lethality in humans.
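The genotype counts among offspring of carrier couples can be compared with Mendelian expectations. The sketch below assumes 1:2:1 segregation, so that each offspring of two heterozygous parents has a probability of 1/4 of being homozygous for the variant, and assumes offspring genotypes are independent. The function names are hypothetical. This is a simple illustration, not an analysis reported in the study.

```python
def expected_genotype_counts(n_offspring):
    """Expected counts under 1:2:1 Mendelian segregation from two heterozygous parents."""
    return {
        "hom_ref": n_offspring * 0.25,
        "het": n_offspring * 0.5,
        "hom_alt": n_offspring * 0.25,
    }


def prob_no_homozygotes(n_offspring, p_hom_alt=0.25):
    """Probability that none of n independent offspring is homozygous for the variant."""
    return (1.0 - p_hom_alt) ** n_offspring


# Nine genotyped offspring of carrier couples (count from the text).
print(expected_genotype_counts(9))
print(prob_no_homozygotes(9))  # (3/4)**9, roughly 0.075
```

Under these assumptions, about two of nine offspring would be expected to be homozygous for the variant if the genotype were viable. Observing none has a probability of roughly 0.075, so the offspring counts alone are compatible with embryonic lethality but would not establish it without the population-level deficit.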

MRPS30
The missense variant p.Ile233Arg in MRPS30 shows a significant homozygous deficit (expected homozygotes = 48.2, observed homozygotes = 1) (Table 1). This variant has a similar allele frequency in the six populations studied (0.34-1.05%). Notably, the single observed homozygote was detected in Denmark. The MRPS30 gene has not been knocked out in mice, but it is essential for proliferation in human cell lines 6. Since cell-essential genes are effectively a subset of mouse-lethal genes, we designate MRPS30 as a plausible candidate developmental lethal gene 6 (Table 1). MRPS30, also known as programmed cell death 9 (PDCD9), encodes the nuclear-encoded mitochondrial ribosomal protein S30. Little is known about the role of nuclear-encoded mitochondrial ribosomal proteins during mammalian embryogenesis, but results from mouse knockout studies indicate a role in the initiation of gastrulation 43.
In the Icelandic population, we observe four couples in which both partners are heterozygous for the p.Ile233Arg variant in MRPS30. We do not observe a history of early lethality or information indicative of severe congenital conditions among their seven offspring. Furthermore, we do not observe a significantly higher incidence of miscarriage in carrier couples.

BRF2
We detect a significant deficit of homozygous genotypes in Iceland for the splice donor [...]
BRF2 (also known as TFIIIB50) encodes a subunit of the BRF2-TFIIIB complex, which is involved in the recruitment of RNA polymerase III to genes with type 3 promoters. These genes are transcribed into short, abundant non-protein-coding RNAs that have key functional roles, particularly in the protein synthesis apparatus 45.
BRF2 is an oncogene, and increased gene expression levels have been detected in several types of cancers, including melanoma, gastric, kidney, esophageal, and lung cancers 46.
Furthermore, gene inactivation experiments in human cell lines have shown that BRF2 is essential for cell proliferation 6. However, the effect of BRF2 loss-of-function sequence variation in living organisms is unknown. Such mutations have not been linked to severe clinical phenotypes in humans and have yet to be introduced in mouse models.
In our data, we do not observe a significant excess of reported miscarriage in 58 pregnancies among the 19 carrier couples compared to non-carrier couples matched on year of birth and number of pregnancies (P = 0.19) ([...]
GTF2H3 encodes a core subunit of the TFIIH basal transcription factor that regulates RNA polymerase II transcription and is involved in nucleotide excision repair 47 [...] 55. The RPAP2 gene has not been targeted in mice but has been shown to be essential for cell proliferation in human cell lines (Table 1).
Interestingly, a splice variant in RPAP2 with a carrier frequency of 21% in a purebred cattle population shows a complete homozygous deficit due to early embryonic lethality. This variant is also associated with a strong negative effect on reproduction in the same population 59 (Supplementary Data 20).

CASP9
We observe a significant deficit of individuals homozygous for the missense variant p.His237Pro in CASP9, which encodes the cysteine-aspartic protease caspase 9. In the combined dataset, 11.9 p.His237Pro homozygotes are expected, and one is observed.
Casp9 is a pro-apoptotic protein that acts as a regulator of physiological cell death and degeneration of pathological tissues. More specifically, caspase 9 is an initiator caspase that can activate downstream effector caspases and trigger a signaling cascade that induces apoptotic cell death 60. Experiments have shown that CASP9 knockout mice have an enlarged and malformed cerebrum as a result of reduced apoptosis during development, and a majority of them die perinatally. Furthermore, in vivo deletion of CASP9 has been shown to inhibit activation of the downstream effector caspase Casp3, and CASP9 knockout thymocytes show resistance to a number of apoptotic stimuli 61.
The missense variant p.His237Pro disrupts the caspase active site, which is highly conserved among the twelve caspase genes in the human genome 62 (Supplementary Figure 13). A prior report has described a family with recurrent, folate-resistant neural tube defects in which two affected fetuses were heterozygous for both p.His237Pro and c.924dupT, a frameshift variant in CASP9 63. In vitro experiments on cells transfected with the missense variant have shown that the mutation impairs the cellular response to apoptosis and that mutant Casp9 protein has a dominant-negative effect on wild-type Casp9 64.

MTG2
In our combined population set of Iceland, the UK, and Finland, a frameshift variant in [...]
The MTG2 gene encodes the mitochondrial ribosome-associated GTPase 2, also known as GTP-binding protein 5 (GTPBP5). GTPBP5 is involved in the assembly of the 55S mitochondrial ribosome by facilitating modification of the 16S mt-rRNA 65,66, which is part of the 39S large subunit. MTG2 knockouts generated by gene editing in HEK293 cell lines have been shown to lead to severely impaired oxidative phosphorylation in mitochondria, a reduced synthesis rate of mtDNA-encoded proteins, and decreased 55S monosome formation 65. Mtg2 mouse knockouts show embryonic lethality in early gestation (E9.5) 7.